


Creators/Authors contains: "Zhang, He"


  1. Abstract

    Many RNAs function through RNA–RNA interactions. Fast and reliable RNA structure prediction that accounts for RNA–RNA interaction is useful; however, existing tools are either too simplistic or too slow. To address this issue, we present LinearCoFold, which approximates the complete minimum free energy structure of two strands in linear time, and LinearCoPartition, which approximates the cofolding partition function and base pairing probabilities in linear time. LinearCoFold and LinearCoPartition are orders of magnitude faster than RNAcofold. For example, on a sequence pair with a combined length of 26,190 nt, LinearCoFold is 86.8× faster than RNAcofold MFE mode, and LinearCoPartition is 642.3× faster than RNAcofold partition function mode. Surprisingly, LinearCoFold and LinearCoPartition’s predictions have higher PPV and sensitivity for intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the RNA–RNA interaction between SARS-CoV-2 genomic RNA (gRNA) and human U4 small nuclear RNA (snRNA), which has been experimentally studied, and observe that LinearCoFold’s prediction correlates better with the wet lab results than RNAcofold’s.

     
  2. Synthetic matrices with dynamic presentation of cell guidance cues are needed for the development of physiologically relevant in vitro tumor models. Towards the goal of mimicking prostate cancer progression and metastasis, we engineered a tunable hyaluronic acid-based hydrogel platform with protease-degradable and cell-adhesive properties employing bioorthogonal tetrazine ligation with strained alkenes. The synthetic matrix was first fabricated via a slow tetrazine-norbornene reaction, then temporally modified via a diffusion-controlled method using trans-cyclooctene, a fierce dienophile that reacts with tetrazine at an unusually fast rate. The encapsulated DU145 prostate cancer single cells spontaneously formed multicellular tumoroids after 7 days of culture. In situ modification of the synthetic matrix via covalent tagging of cell-adhesive RGD peptide induced tumoroid decompaction and the development of cellular protrusions. RGD tagging did not compromise the overall cell viability, nor did it induce cell apoptosis. In response to increased matrix adhesiveness, DU145 cells dynamically loosen cell-cell adhesion and strengthen cell-matrix interactions to promote an invasive phenotype. Characterization of the 3D cultures by immunocytochemistry and gene expression analyses demonstrated that cells invaded into the matrix via a mesenchymal-like migration, with upregulation of major mesenchymal markers and downregulation of epithelial markers. The tumoroids formed cortactin-positive invadopodia-like structures, indicating active matrix remodeling. Overall, the engineered tumor model can be utilized to identify potential molecular targets and test pharmacological inhibitors, thereby accelerating the design of innovative strategies for cancer therapeutics.
    Free, publicly-accessible full text available August 1, 2024
  3. Abstract

    Many RNAs fold into multiple structures at equilibrium, and there is a need to sample these structures according to their probabilities in the ensemble. The conventional sampling algorithm suffers from two limitations: (i) the sampling phase is slow due to many repeated calculations; and (ii) the end-to-end runtime scales cubically with the sequence length. These issues make the algorithm difficult to apply to long RNAs, such as the full genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To address these problems, we devise a new sampling algorithm, LazySampling, which eliminates redundant work via on-demand caching. Based on LazySampling, we further derive LinearSampling, an end-to-end linear time sampling algorithm. Benchmarking on nine diverse RNA families, the sampled structures from LinearSampling correlate better with the well-established secondary structures than Vienna RNAsubopt and RNAplfold. More importantly, LinearSampling is orders of magnitude faster than standard tools, being 428× faster (72 s versus 8.6 h) than RNAsubopt on the full genome of SARS-CoV-2 (29 903 nt). The resulting sample landscape correlates well with the experimentally guided secondary structure models, and is closer to the alternative conformations revealed by experimentally driven analysis. Finally, LinearSampling finds 23 regions of 15 nt with high accessibilities in the SARS-CoV-2 genome, which are potential targets for COVID-19 diagnostics and therapeutics.

     
  4. Changes in developmental gene regulatory networks (dGRNs) underlie much of the diversity of life, but the evolutionary mechanisms that operate on interactions with these networks remain poorly understood. Closely related species with extreme phenotypic divergence provide a valuable window into the genetic and molecular basis for changes in dGRNs and their relationship to adaptive changes in organismal traits. Here we analyze genomes, epigenomes, and transcriptomes during early development in two sea urchin species in the genus Heliocidaris that exhibit highly divergent life histories and in an outgroup species. Signatures of positive selection and changes in chromatin status within putative gene regulatory elements are both enriched on the branch leading to the derived life history, and particularly so near core dGRN genes; in contrast, positive selection within protein-coding regions has at most a modest enrichment in branch and function. Single-cell transcriptomes reveal a dramatic delay in cell fate specification in the derived state, which also has far fewer open chromatin regions, especially near dGRN genes with conserved roles in cell fate specification. Experimentally perturbing the function of three key transcription factors reveals profound evolutionary changes in the earliest events that pattern the embryo, disrupting regulatory interactions previously conserved for ~225 million years. Together, these results demonstrate that natural selection can rapidly reshape developmental gene expression on a broad scale when selective regimes abruptly change and that even highly conserved dGRNs and patterning mechanisms in the early embryo remain evolvable under appropriate ecological circumstances.
  5. The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold’s purely in silico prediction not only is close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5′ and 3′ untranslated regions (UTRs) (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics.